Dyna: Extending Datalog for Modern AI
Abstract
Modern statistical AI systems are quite large and complex; this interferes with research, development, and education. We observe that most of the computation involves solving systems of equations of a certain kind. Specifically, recursive queries look up and aggregate relevant or potentially relevant values. If the results of these queries are memoized for reuse, the memos may need to be updated through change propagation. In-memory database systems can do a lot to support this work in a generic way. Through examples, we show that a wide variety of AI algorithms can be concisely captured by writing down systems of equations in an expressive declarative notation that generalizes Datalog, independent of the many strategies that could be used to actually solve those systems. We use these examples to motivate certain extensions to Datalog, which are connected to functional and object-oriented programming paradigms.

1 Why a New Data-Oriented Language for AI?

Modern AI systems are frustratingly big, making them time-consuming to engineer and difficult to modify. In this paper, we describe our work toward a declarative language that was motivated originally by various use cases in AI. Our goal is to make it easier to specify a wide range of new systems that are more or less in the mold of existing AI systems. Our declarative language should simplify inferential computation in the same way that the declarative language of regular expressions has simplified string pattern matching and transduction.

* This material is based on work supported by the National Science Foundation under Grants No. 0347822 and 0964681 to the first author, and by a graduate fellowship to the second author from the Human Language Technology Center of Excellence, Johns Hopkins University. We thank Wren N. G. Thornton and John Blatz for many stimulating discussions. We also thank Yanif Ahmad, Adam Teichert, Jason Smith, Nicholas Andrews, and Veselin Stoyanov for timely comments on parts of the presentation.

Footnote 1: Our own AI research is mainly on natural language processing, but as we show here, our observations and approach apply to other AI domains as well.

All areas of AI have become data-intensive, owing in part to the pervasiveness of statistical modeling and machine learning. A system’s extensional data (inputs) include not only current sensory input but also background knowledge, large collections of training examples, and parameters trained from past experience. The intensional data (intermediate results and outputs) include combinatorially many possible analyses and conclusions derived from the inputs.

Each AI system usually builds and maintains its own custom data structures, so that it can efficiently query and update the current state of the system. Although many conceptual ideas are reused across AI, each implemented system tends to include its own specialized code for storage and inference, specialized to the data and computations used by that system. This turns a small mathematical abstraction into a large optimized implementation. It is difficult to change either the abstract computation or the storage and execution strategy because they are intertwined throughout the codebase. This also means that reusable general strategies have to be instantiated anew for each implemented system, and cannot even be easily described in an abstract way.

As an alternative, we are working to develop an appealing declarative language, Dyna, for concise specification of algorithms, with a compiler that turns such specifications into efficient code for storage and inference. Our goal is to produce a language that practitioners will actually use.

The heart of this long paper is the collection of suggestive Dyna code examples in §3.1.
Readers are thus encouraged to browse at their leisure through Figures 1–12, which are relatively self-contained. Readers are also welcome to concentrate on the main flow of the paper, skipping over details that have been relegated for this reason to footnotes, figures, and appendices.

1.1 AI and Databases Today

Is a new language necessary? That is, why don’t AI researchers already use database systems to manage their data [29]? After all, any procedural AI program is free to store its data in an external database. It could use Datalog or SQL to express queries against the current state of a database, perform some procedural computation on the results, and then store the results back to the database. Unfortunately, there is rather little in most AI systems that looks like typical database queries:

– Queries in a standard language like Datalog or SQL are not expressive enough for any one query to capture the entire AI computation. These languages allow only restricted ways to query and combine data. The restrictions are intended to guarantee that each query terminates, runs in polynomial time, and has a single well-defined answer. Yet the overall AI algorithm may not be able to make those guarantees anyway, so the effect of the restrictions is only to partition the algorithm artificially into many smaller queries. This limits the opportunities for the database system itself to plan, rearrange, and parallelize computations.

– It may be inefficient to implement the algorithm in terms of database queries. Standard database implementations are good at large queries. They are intended to handle large, usually disk-backed, long-lived, read-mostly datasets that can be easily represented as tuples. Due to the high latency of disk access, they focus on supporting computations on large sets of records at a time. By contrast, AI systems typically work with lots of smaller, in-memory, ephemeral, write-heavy data sets often accessed at the level of individual records. For example, upon creating a promising hypothesis, the AI system might try to score it or extend it or compute its consequences, which involves looking up and storing individual records related to that specific hypothesis. Channeling these record-at-a-time queries and updates through a standard database would have considerable overhead. At the other extreme, one might try to reorganize the computation into coarser set-at-a-time queries where the database system will shine; but while such batch computations are a better fit to disk-based database systems, and also have the advantage that they can amortize work across many records, they may also do extra work that would be skipped by record-at-a-time strategies (by materializing larger relations that may include records that turn out not to be needed).

– Standard database languages do not support features for programming-in-the-large, such as modules, structured objects, or inheritance.

In this setting, switching from a data structure library to a relational database management system is likely to hurt performance without significantly reducing the implementation burden.

1.2 A Declarative Alternative

Our approach instead eliminates most of the procedural program and specifies its computations declaratively. We build on Datalog to propose a convenient, elegantly concise notation for specifying the systems of equations that relate intensional and extensional data. This is the focus of §2, beginning with a review of ordinary Datalog in §2.1.

A program in our Dyna language specifies what we call a dynabase, which is a kind of deductive database. Recall that a deductive database [38,168] contains not only extensional relations but also rules (usually Datalog rules or some other variant on Horn clauses) that define additional intensional relations.
Our term “dynabase” emphasizes that our deductive databases are dynamic: they can be declaratively extended into new dynabases that have modified extensional data, with consequent differences in the intensional data. Also, one dynabase can be defined in terms of others, supporting modularity (§2.7).

Because a Dyna program merely specifies a dynabase, it has no serial I/O or side effects. How, then, are dynabases used in a procedural environment? A running process written in one’s favorite procedural language, which does have I/O and side effects, can create a dynabase and update it serially by adding extensional data (§4.6). At any time, the process can query the dynabase to retrieve either the current extensional data, or intensional data that are defined in terms of the extensional data. As the process updates the extensional data, the intensional data that depend on it (possibly in other dynabases) are automatically maintained, as in a spreadsheet. Carrying out the query and update operations requires the “heavy computational lifting” needed in AI for search, deduction, abduction, message passing, etc. However, the needed computations are specified only declaratively and at a high level of abstraction. They are carried out by the Dyna execution engine (eagerly or lazily) as needed to serve the process.

Dyna extends Datalog in several ways. In particular, Dyna is Turing-complete, so that the full computation needed by an AI system can be triggered by a single query against a dynabase. It is not necessary to specify which data to look up when, or whether or where to store the results. Essentially, a Dyna program is a set of equational schemata that are similar to Datalog rules with (non-stratified) negation and aggregation. These schemata together with the extensional data define a possibly infinite system of equations, and the queriable “contents” of the dynabase come from a solution to this system.
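The spreadsheet-like maintenance of intensional data described above can be illustrated with a small Python sketch. This is not Dyna's execution engine: the class `TinyDynabase` and its methods `define`, `update`, and `query` are hypothetical names invented for illustration, and the sketch assumes acyclic rules, numeric values, and lazy recomputation of memoized results.

```python
class TinyDynabase:
    """Hypothetical toy: memoized derived values, invalidated on update."""

    def __init__(self):
        self.extensional = {}  # item name -> value supplied by the process
        self.rules = {}        # item name -> (function, list of input names)
        self.memo = {}         # cached values of intensional items

    def define(self, name, fn, inputs):
        """Declare an intensional item as a function of other items."""
        self.rules[name] = (fn, inputs)

    def update(self, name, value):
        """Change extensional data and propagate the change to dependents."""
        self.extensional[name] = value
        self._invalidate(name)

    def _invalidate(self, changed):
        # Drop every memo that depends, directly or indirectly, on `changed`.
        # Assumes the rules are acyclic; a cycle would recurse forever.
        for name, (_, inputs) in self.rules.items():
            if changed in inputs:
                self.memo.pop(name, None)
                self._invalidate(name)

    def query(self, name):
        """Return an item's value, recomputing lazily if no memo exists."""
        if name in self.extensional:
            return self.extensional[name]
        if name not in self.memo:
            fn, inputs = self.rules[name]
            self.memo[name] = fn(*(self.query(i) for i in inputs))
        return self.memo[name]


db = TinyDynabase()
db.update("a", 1.0)
db.update("b", 2.0)
db.define("total", lambda a, b: a + b, ["a", "b"])
db.define("double", lambda t: 2 * t, ["total"])

first = db.query("double")   # computed from a=1.0, b=2.0
db.update("a", 5.0)          # invalidates the memos for total and double
second = db.query("double")  # recomputed from a=5.0, b=2.0
```

The process never states when `total` or `double` should be recomputed; it only updates inputs and asks queries. An eager variant would recompute inside `update` rather than on the next `query`.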
We give a gentle introduction in §2.3 and sketch a provisional semantics in the appendix (§A). The ways in which Dyna relaxes Datalog’s restrictions are listed in §2.4. The resulting Turing-completeness gives greater freedom to both the Dyna programmer and the execution model, along with greater responsibility. We justify these and other extensions by way of various examples from AI in §3.

Dyna also includes programming language features that improve its usability, such as typing, function evaluation, encapsulation, inheritance, and reflection. Finally, Dyna’s syntax for aggregation is very concise (even compared to other logic notations, let alone explicit loops) because its provable items have arbitrary values, not just truth values. Evaluating items in place makes it possible to write equations quite directly, with arithmetic and nested function evaluation.

As Figures 1–12 illustrate, Dyna programs are startlingly short relative to more traditional, procedural versions. They naturally support record-at-a-time execution strategies (§2.6), as well as automatic differentiation (§3.1) and change propagation (§4.3), which are practically very important. They are modular and can be easily integrated with one another into larger programs (§2.7). Finally, they do not specify any particular storage or execution strategies, leaving opportunities for both automatic and user-directed optimizations that preserve correctness (§5.3).

1.3 Storage and Execution Strategies

We argue in §4.1 that traditional (procedural) approaches to statistical AI are showing strain under modern workloads and that this declarative approach stands to ease the burden of both research and development. We discuss in the rest of §4 how Dyna addresses a number of practical difficulties in the AI community, giving more details in §4.6 of how to compute using declarative dynabases in a procedural environment.

Footnote 2: A procedural process will therefore see changes when it queries a dynabase again. Alternatively, it may make a continuous query, whereby it is notified of updates to the query result (§4.6).

In this paper, we focus on the expressivity and uses of the Dyna language, as a user of Dyna would. From this point of view, the underlying computation order, indexing, and storage are distractions from a Dyna program’s fundamentally declarative specification, and are relegated to an execution model, just as ordinary Datalog or SQL is a declarative language that leaves query optimization up to the database engine. Actually computing and updating intensional data under a Dyna program may involve recursive internal queries and other work. However, this happens in some implementation-dependent order that can be tuned manually or automatically without affecting correctness.

The natural next questions are from an implementor’s point of view. They concern this query and update planning, as well as physical design. How do we systematize the space of execution strategies and optimizations? Given a particular Dyna program and workload, can a generic Dyna engine discover the algorithms and data structures that an expert would choose by hand? We have been studying many interesting cases that are beyond the scope of this paper. However, §5 briefly outlines some of the many issues regarding efficient computation, beginning with an earlier prototype implementation.

By showing in this paper that Dyna is capable of describing a wide range of computations, we mean to argue that finding efficient execution strategies for Dyna constitutes a substantial general program of research on algorithms for AI and logic programming. After all, one would like a declarative solution of a given problem to exploit the relevant tricks used by the state-of-the-art procedural solutions.
But then it is necessary to generalize these tricks into strategies that can be incorporated more generally into the Dyna runtime engine or encapsulated as general Dyna-to-Dyna program transformations [67,46]. These strategies may then be applied in new contexts. Building a wide range of tricks and strategies into the Dyna environment also raises the issue of how to manually specify and automatically tune strategies that work well on a particular workload.

2 Basic Features of the Language

Our goal in this section is to sketch just enough of Dyna that readers will be able to follow our AI examples in the next section. After quickly reviewing Datalog, we explain how Dyna augments Datalog by proving that terms have particular values, rather than merely proving that they are true; by relaxing certain restrictions; and by introducing useful notions of encapsulation and inheritance. Formal details are confined to an appendix (§A).

Footnote 3: More restricted declarative problems already have communities that work on efficient execution: propositional satisfiability, integer linear programming, queries and physical design in relational databases, etc.

2.1 Background: Datalog

Datalog [37] is a language (a concrete syntax) for defining named, flat relations. The (slightly incorrect) statement “Two people are siblings if they share a parent” can be precisely captured by a rule such as

    sibling(A,B) :- parent(C,A), parent(C,B). (1)

which may be read as “A is a sibling of B if, for some C, C is a parent of A and C is a parent of B.” Formally, capitalized identifiers such as A,B,C denote universally quantified variables, and the above rule is really a schema that defines infinitely many propositional implications such as

    sibling(alice,bob) :- parent(charlie,alice), parent(charlie,bob). (2)

where alice, bob, and charlie are constants. (Thus, (2) is one of many possible implications that could be used to prove sibling(alice,bob).)
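To make the reading of rule (1) concrete, the following Python sketch evaluates it as a self-join over a set of parent tuples. It is only an illustration, not how a Datalog engine works internally: the function name `sibling` and the choice of a Python set of pairs are assumptions, and the two parent tuples are taken from the ground instance (2).

```python
from itertools import product

# Parent tuples (C, child), taken from the ground instance (2) in the text.
parent = {("charlie", "alice"), ("charlie", "bob")}


def sibling(parent_facts):
    """Evaluate sibling(A,B) :- parent(C,A), parent(C,B) as a self-join on C.

    Every pair of parent tuples that agrees on the parent C contributes
    one derived tuple (A, B); C itself is projected out of the result.
    """
    return {
        (a, b)
        for (c1, a), (c2, b) in product(parent_facts, repeat=2)
        if c1 == c2
    }


derived = sibling(parent)
```

Here `derived` contains `("alice", "bob")` and `("bob", "alice")`, but also `("alice", "alice")` and `("bob", "bob")`, because nothing in rule (1) requires A and B to differ. This is one way to see why the text calls the informal statement slightly incorrect.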
Rules can also mention constants directly, as in

    parent(charlie,alice).
    parent(charlie,bob). (3)

Since the rules (3) also happen to have no conditions (no “:- ...” part), they are simply facts that directly specify part of the parent relation. Taking a relational database view, they specify records in a two-column table (binary relation) called parent. The rule (1) defines another two-column table, sibling, by joining parent to itself on its first column and projecting that column out of the result. Informally, we may regard parent (3) as extensional and sibling (1) as intensional, but Datalog as a language does not have to distinguish these cases. Datalog also does not specify whether the sibling relation should be materialized or whether its individual records should merely be computed as needed.

As this example suggests, it is simple in Datalog to construct new relations from old ones. Datalog rules can easily capture relational algebra operators such as join, project, and select. They also permit recursive definitions.

Datalog imposes the following syntactic restrictions to ensure that the defined relations are finite [37]:

– Flatness: Terms in a rule must include exactly one level of parentheses. This prevents recursive structure-building rules like
Publication date: 2010